AITopics | rate tuner

rate tuner

Stepping on the Edge: Curvature Aware Learning Rate Tuners

Neural Information Processing Systems, Mar-20-2026, 16:04:21 GMT

Curvature information -- particularly, the largest eigenvalue of the loss Hessian, known as the sharpness -- often forms the basis for learning rate tuners. However, recent work has shown that the curvature information undergoes complex dynamics during training, going from a phase of increasing sharpness to eventual stabilization. We analyze the closed-loop feedback effect between learning rate tuning and curvature. We find that classical learning rate tuners may yield greater one-step loss reduction, yet they ultimately underperform in the long term when compared to constant learning rates in the full batch regime. These models break the stabilization of the sharpness, which we explain using a simplified model of the joint dynamics of the learning rate and the curvature. To further investigate these effects, we introduce a new learning rate tuning method, Curvature Dynamics Aware Tuning (CDAT), which prioritizes long term curvature stabilization over instantaneous progress on the objective. In the full batch regime, CDAT shows behavior akin to prefixed warm-up schedules on deep learning objectives, outperforming tuned constant learning rates. In the mini batch regime, we observe that stochasticity introduces confounding effects that explain the previous success of some learning rate tuners at appropriate batch sizes. Our findings highlight the critical role of understanding the joint dynamics of the learning rate and curvature, beyond greedy minimization, to diagnose failures and design effective adaptive learning rate tuners.
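The sharpness defined in the abstract, the largest eigenvalue of the loss Hessian, can be estimated without forming the Hessian explicitly. The sketch below is illustrative only and is not taken from the paper. It assumes a toy quadratic loss 0.5 * x^T A x with A = [[3, 1], [1, 2]]; the names `toy_grad`, `hvp`, and `estimate_sharpness` are hypothetical. Hessian-vector products are approximated by central finite differences of the gradient, and power iteration is used, which recovers the largest eigenvalue only when it is also the largest in magnitude.

```python
import math
import random


def toy_grad(x):
    # Gradient of the assumed toy loss 0.5 * x^T A x with A = [[3, 1], [1, 2]].
    return [3.0 * x[0] + 1.0 * x[1], 1.0 * x[0] + 2.0 * x[1]]


def hvp(grad_fn, x, v, eps=1e-4):
    # Central finite difference of the gradient along v approximates H v.
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]


def estimate_sharpness(grad_fn, x, iters=100, seed=0):
    # Power iteration on Hessian-vector products, started from a random unit vector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(vi * vi for vi in v))
    v = [vi / norm for vi in v]
    lam = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, x, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(h * h for h in hv))
        if norm == 0.0:
            break
        v = [h / norm for h in hv]
    return lam


sharp = estimate_sharpness(toy_grad, [0.5, -0.3])
print(sharp)
```

For this toy matrix the analytic largest eigenvalue is (5 + sqrt(5)) / 2, so the estimate can be checked against it. In a deep learning setting one would typically replace the finite difference with an automatic-differentiation Hessian-vector product.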

artificial intelligence, machine learning, proceedings, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Stepping on the Edge: Curvature Aware Learning Rate Tuners

Neural Information Processing Systems, Feb-13-2026, 14:36:34 GMT

… (Liu and Nocedal, 1989). Similar efforts have been made for Polyak stepsizes (Berrada et al., 2020; Loizou et al., 2021), in addition to new methods which combine distance to optimality with online learning convergence bounds (Cutkosky et al., 2023; …). Classically-inspired methods, however, have generally struggled to gain traction in deep learning.
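For readers unfamiliar with the Polyak stepsize mentioned in this excerpt, the classical rule sets the learning rate to (f(x) - f*) / ||g||^2, where f* is the optimal loss value. The sketch below is a minimal illustration under stated assumptions, not the variants studied in the cited works: it assumes f* is known (0 for the toy quadratic used here), and `toy_loss`, `toy_grad`, and `polyak_gd` are hypothetical names.

```python
def toy_loss(x):
    # Assumed toy loss 0.5 * x^T A x with A = [[3, 1], [1, 2]]; its minimum value is 0.
    return 0.5 * (3.0 * x[0] ** 2 + 2.0 * x[0] * x[1] + 2.0 * x[1] ** 2)


def toy_grad(x):
    return [3.0 * x[0] + 1.0 * x[1], 1.0 * x[0] + 2.0 * x[1]]


def polyak_gd(loss_fn, grad_fn, x, f_star, steps):
    # Gradient descent with the classical Polyak stepsize (f(x) - f*) / ||g||^2.
    for _ in range(steps):
        g = grad_fn(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # already at a stationary point
        lr = (loss_fn(x) - f_star) / g_norm_sq
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


x_init = [0.5, -0.3]
x_final = polyak_gd(toy_loss, toy_grad, x_init, f_star=0.0, steps=200)
print(toy_loss(x_init), toy_loss(x_final))
```

The rule needs f*, which is rarely known in deep learning; that practical obstacle is one reason the cited works modify it.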

artificial intelligence, machine learning, regime, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Education > Educational Setting > Online (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

555479a201da27c97aaeed842d16ca49-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 02:55:32 GMT

learning rate, rate tuner, regime, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Stepping on the Edge: Curvature Aware Learning Rate Tuners

Roulet, Vincent, Agarwala, Atish, Grill, Jean-Bastien, Swirszcz, Grzegorz, Blondel, Mathieu, Pedregosa, Fabian

arXiv.org Artificial Intelligence, Jul-8-2024

Curvature information -- particularly, the largest eigenvalue of the loss Hessian, known as the sharpness -- often forms the basis for learning rate tuners. However, recent work has shown that the curvature information undergoes complex dynamics during training, going from a phase of increasing sharpness to eventual stabilization. We analyze the closed-loop feedback effect between learning rate tuning and curvature. We find that classical learning rate tuners may yield greater one-step loss reduction, yet they ultimately underperform in the long term when compared to constant learning rates in the full batch regime. These models break the stabilization of the sharpness, which we explain using a simplified model of the joint dynamics of the learning rate and the curvature. To further investigate these effects, we introduce a new learning rate tuning method, Curvature Dynamics Aware Tuning (CDAT), which prioritizes long term curvature stabilization over instantaneous progress on the objective. In the full batch regime, CDAT shows behavior akin to prefixed warm-up schedules on deep learning objectives, outperforming tuned constant learning rates. In the mini batch regime, we observe that stochasticity introduces confounding effects that explain the previous success of some learning rate tuners at appropriate batch sizes. Our findings highlight the critical role of understanding the joint dynamics of the learning rate and curvature, beyond greedy minimization, to diagnose failures and design effective adaptive learning rate tuners.
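Two quantities from the abstract can be made concrete on a toy problem: the greedy learning rate that maximizes one-step loss reduction, and the stability threshold tied to the sharpness. The sketch below is an illustration under assumptions, not the paper's experiments or CDAT itself. It uses a fixed quadratic 0.5 * x^T A x (so the curvature cannot react to the learning rate, and the closed-loop effect described in the abstract is not reproduced), for which the greedy rate along the negative gradient is g^T g / g^T A g and gradient descent with a constant rate is stable only below 2 / lambda_max. All function names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss(A, x):
    return 0.5 * dot(x, matvec(A, x))


def greedy_lr(A, x):
    # Exact one-step minimizer of the quadratic along -g: g^T g / g^T A g.
    g = matvec(A, x)
    return dot(g, g) / dot(g, matvec(A, g))


def gd(A, x, lr, steps=1):
    # Plain gradient descent with a constant learning rate.
    for _ in range(steps):
        g = matvec(A, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


A = [[3.0, 1.0], [1.0, 2.0]]
lam_max = (5.0 + 5.0 ** 0.5) / 2.0  # largest eigenvalue of A (its sharpness)
x0 = [0.5, -0.3]

# One-step comparison: the greedy rate beats a nearby constant rate.
eta = greedy_lr(A, x0)
loss_greedy = loss(A, gd(A, x0, eta))
loss_other = loss(A, gd(A, x0, 0.9 * eta))

# Stability comparison: just below versus just above 2 / lam_max.
loss_stable = loss(A, gd(A, x0, 1.9 / lam_max, steps=50))
loss_unstable = loss(A, gd(A, x0, 2.1 / lam_max, steps=50))
print(loss_greedy, loss_other, loss_stable, loss_unstable)
```

On a real network the sharpness changes during training, which is why the abstract argues that the learning rate and curvature have to be analyzed jointly rather than one step at a time.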

learning rate, rate tuner, regime, (11 more...)

2407.06183

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Slovakia > Bratislava > Bratislava (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Energy (0.48)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
